Tree Thinking

April Wright
08.09.2018

Good Morning!

  • What is a tree?
  • How is a tree built?
  • What are phylogenetic data?

What do we do with a phylogeny?

  • Determine the timing of trait evolution

Skink tree from Wright et al. 2015

What do we do with a phylogeny?

  • Tell homology from convergence

Dolphin photo: Alex Vasenin via Wikimedia

What do we do with a phylogeny?

  • Taxonomy

Ask a Biologist

What do we do with a phylogeny?

  • Taxonomy

  • Hennig, 1950 Grundzüge einer Theorie der Phylogenetischen Systematik
    • Taxonomy should be logically consistent with the tree for the group
  • Sneath & Sokal, 1963, 1973
    • Using distance matrices to cluster based on phenetic similarity

Tree Terms: Tip

library(phytools)
tree <- pbtree(n = 5)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

plot of chunk unnamed-chunk-1

Tip: What we are putting on the tree. May be a species, an individual, or a higher-order taxon. Also called a terminal node, leaf, or one-degree node.

Tree Terms: Branch

library(phytools)
tree <- pbtree(n = 5)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

plot of chunk unnamed-chunk-2

Branch: What connects a tip to the rest of the tree. Branch lengths can have a variety of units, which we will discuss over the next few days. Also called an edge.

Tree Terms: Node

library(phytools)
tree <- pbtree(n = 5)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

plot of chunk unnamed-chunk-3

Node: Where branches meet, implying a most recent common ancestor. Also called a vertex or three-degree node.
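
The three terms so far map directly onto the `phylo` object that ape and phytools work with. A minimal sketch, assuming a small hand-written Newick tree (`toy_tree` is illustrative, not part of the workshop data) so that the counts are the same on every run, unlike the random `pbtree()` output:

```r
library(ape)

# Hand-written four-tip tree (an illustrative assumption)
toy_tree <- read.tree(text = "((A,B),(C,D));")

toy_tree$tip.label   # the tips: "A" "B" "C" "D"
Ntip(toy_tree)       # number of tips: 4
Nnode(toy_tree)      # number of internal nodes, root included: 3
nrow(toy_tree$edge)  # number of branches (edges): 6
```

Each row of `toy_tree$edge` is one branch, written as a pair of node numbers (ancestor, descendant).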

Tree Terms

plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5, direction = "downwards")

plot of chunk unnamed-chunk-4

Tree Terms

plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5, type="fan")

plot of chunk unnamed-chunk-5

Tree Terms: Rotation - reflecting taxa at a node

plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)
nodelabels(cex = 3.5)

plot of chunk unnamed-chunk-6

# tree <- rotateNodes(tree, c(8, 9))  # rotateNodes() returns a new tree, so assign the result
# plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

Tree Terms: Monophyletic - an ancestor and all its descendants

is.monophyletic(tree, c("t1", "t2"), plot = TRUE, edge.width = 1.5, cex = 3.5, no.margin = TRUE)

plot of chunk unnamed-chunk-7

[1] FALSE
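
Because `pbtree()` simulates a new random tree each time, whether t1 and t2 form a monophyletic group will vary from run to run. A small sketch with a fixed tree makes the definition easier to check by eye. The Newick string and the name `toy_tree` are assumptions for illustration only:

```r
library(ape)

# Fixed five-tip tree: A and B are sisters, C is sister to (A,B)
toy_tree <- read.tree(text = "(((A,B),C),(D,E));")

is.monophyletic(toy_tree, c("A", "B"))       # TRUE: an ancestor and all its descendants
is.monophyletic(toy_tree, c("A", "B", "C"))  # TRUE
is.monophyletic(toy_tree, c("A", "C"))       # FALSE: their common ancestor also gave rise to B
```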

Tree Terms: Rooting

# reroot(tree, node.number)
plot(tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

plot of chunk unnamed-chunk-8

Ingroup: Taxa of interest
Outgroup: A closely related taxon used to root the tree

Tree Terms: Rooting

unroot_tree <- unroot(tree)
plot(unroot_tree, cex = 3.5, no.margin = TRUE, edge.width = 1.5)

plot of chunk unnamed-chunk-9
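
Rooting with an outgroup can be sketched on a fixed tree as well. This example uses ape's `root()` rather than the phytools `reroot()` shown earlier, and the tree and the choice of E as outgroup are assumptions made for illustration:

```r
library(ape)

toy_tree <- read.tree(text = "(((A,B),C),(D,E));")

# Drop the root, then put it back on the branch leading to the outgroup
unrooted <- unroot(toy_tree)
is.rooted(unrooted)   # FALSE

rerooted <- root(unrooted, outgroup = "E", resolve.root = TRUE)
is.rooted(rerooted)   # TRUE

# With E as the outgroup, the remaining taxa form the ingroup clade
is.monophyletic(rerooted, c("A", "B", "C", "D"))  # TRUE
```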

How is a tree built?

  • Many ways. We will focus on three:
    • Maximum parsimony
    • Maximum likelihood
    • Bayesian inference

Parsimony

  • Not only applied in phylogenetics
  • The simplest explanation for the observed data is the best

Phylogenetic Data

library(alignfigR)
char_data <- read_alignment("data/bears_fasta.fa")
char_data[1:3]
$Agriarctos_spp
 [1] "?" "0" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "0"
[18] "0" "0" "1" "1" "1" "1" "0" "0" "1" "?" "1" "1" "?" "0" "1" "1" "1"
[35] "1" "0" "1" "1" "0" "?" "?" "0" "1" "1" "1" "0" "?" "?" "?" "?" "?"
[52] "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?"

$Ailurarctos_lufengensis
 [1] "?" "0" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?" "?"
[18] "0" "0" "1" "1" "1" "1" "0" "1" "1" "?" "1" "1" "?" "0" "?" "?" "?"
[35] "?" "0" "1" "1" "1" "?" "0" "0" "1" "1" "1" "0" "1" "0" "1" "1" "0"
[52] "1" "1" "?" "?" "?" "?" "?" "?" "?" "?" "?"

$Ailuropoda_melanoleuca
 [1] "1" "0" "1" "1" "1" "1" "0" "1" "1" "0" "1" "0" "0" "1" "0" "0" "0"
[18] "0" "0" "1" "1" "1" "1" "0" "1" "0" "1" "1" "1" "0" "0" "1" "0" "1"
[35] "0" "0" "1" "1" "0" "0" "0" "0" "1" "1" "1" "0" "1" "0" "0" "1" "0"
[52] "1" "1" "0" "0" "0" "1" "0" "0" "0" "1" "0"

These data are binary (states 0 and 1); “?” marks missing data

Always arranged as a “matrix”: rows are taxa and columns are characters
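
The printed object above is a list with one character vector per taxon. A short base R sketch shows how such a list can be stacked into the taxa-by-characters matrix just described. The toy data and the names `char_list` and `char_mat` are assumptions, not the bears dataset:

```r
# Toy data in the same shape as the alignment list: one vector per taxon
char_list <- list(
  taxon_A = c("0", "1", "?"),
  taxon_B = c("1", "1", "0"),
  taxon_C = c("0", "0", "1")
)

# Stack the vectors as rows: rows are taxa, columns are characters
char_mat <- do.call(rbind, char_list)

dim(char_mat)         # 3 taxa, 3 characters
char_mat["taxon_B", ] # all characters scored for one taxon
char_mat[, 2]         # one character across all taxa
```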

Phylogenetic Data

library(ggplot2)
colors <- c("blue", "purple","white")
plot_alignment(char_data, colors, taxon_labels = TRUE) + theme(text = element_text(size=40))

plot of chunk unnamed-chunk-14

How do we go from this to a tree?

Parsimony

  • Maximum parsimony: prefer the tree that requires the fewest “steps”, or character-state changes
  • Let's turn to the board for a minute: parsimony-informative, invariant, and parsimony-uninformative characters
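
The three kinds of character can be told apart with a simple rule: a character is parsimony-informative when at least two states are each shared by at least two taxa. A minimal base R sketch, where the function name `classify_character` is hypothetical and “?” is assumed to mean missing data and is ignored:

```r
# Classify one character, given its states across taxa
classify_character <- function(states) {
  observed <- states[states != "?"]   # assumption: "?" is missing data
  counts <- table(observed)
  if (length(counts) <= 1) {
    return("invariant")
  }
  # Informative: at least two states, each seen in at least two taxa
  if (sum(counts >= 2) >= 2) {
    return("parsimony-informative")
  }
  "parsimony-uninformative"
}

classify_character(c("0", "0", "0", "0", "?"))  # "invariant"
classify_character(c("0", "0", "1", "1", "?"))  # "parsimony-informative"
classify_character(c("0", "0", "0", "1", "?"))  # "parsimony-uninformative"
```

The last case varies, but the single change to state 1 costs one step on every tree, so it cannot favor one tree over another.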

??? Have them start installs on the next page while we do this.

treesiftr

library(treesiftr)
aln_path <- "data/bears_fasta.fa"
bears <- read_alignment(aln_path)
tree <- read.tree("data/starting_tree.tre")

sample_df <- generate_sliding(bears, start_char = 1, stop_char = 5, steps = 1)
print(sample_df)
  starting_val stop_val step_val
1            1        2        1
2            2        3        1
3            3        4        1
4            4        5        1
5            5        6        1

treesiftr

library(phangorn)
library(ggtree)
output_vector <- generate_tree_vis(sample_df = sample_df, alignment = aln_path,
                                   tree = tree, phy_mat = bears, pscore = TRUE)
Final p-score 4 after  2 nni operations 
[1] 1 2
Final p-score 5 after  0 nni operations 
[1] 2 3
Final p-score 6 after  0 nni operations 
[1] 3 4
Final p-score 6 after  1 nni operations 
[1] 4 5
Final p-score 4 after  2 nni operations 
[1] 5 6

treesiftr

output_vector #sample output - you will get more than this when you run in your console
[[1]]

plot of chunk unnamed-chunk-17


[[2]]

plot of chunk unnamed-chunk-17


[[3]]

plot of chunk unnamed-chunk-17


[[4]]

plot of chunk unnamed-chunk-17


[[5]]

plot of chunk unnamed-chunk-17

??? Do a couple trees on the board, including the pruning algorithm. Then allow them to play.

Parsimony: Many trees for one character and 4 taxa

Parsimony Trees

??? This is one character. Imagine many - enumeration is not possible. Also note that several trees tie for the same “best” score.
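
Counting steps for one character on each four-taxon tree can be sketched with Fitch's algorithm, which may differ in detail from the version worked on the board. In this sketch a tree is a nested list, the function name `fitch` and the toy states are assumptions, and missing data are not handled:

```r
# Fitch's algorithm for one character on a rooted binary tree.
# A tip is a taxon name; an internal node is list(left, right).
fitch <- function(node, states) {
  if (is.character(node)) {
    return(list(set = states[[node]], steps = 0))
  }
  left <- fitch(node[[1]], states)
  right <- fitch(node[[2]], states)
  shared <- intersect(left$set, right$set)
  steps <- left$steps + right$steps
  if (length(shared) > 0) {
    list(set = shared, steps = steps)            # children agree: no change needed
  } else {
    list(set = union(left$set, right$set), steps = steps + 1)  # disagree: one step
  }
}

states <- c(A = "0", B = "0", C = "1", D = "1")
trees <- list(
  AB_CD = list(list("A", "B"), list("C", "D")),
  AC_BD = list(list("A", "C"), list("B", "D")),
  AD_BC = list(list("A", "D"), list("B", "C"))
)
scores <- sapply(trees, function(tr) fitch(tr, states)$steps)
scores  # AB_CD needs 1 step; AC_BD and AD_BC need 2 steps each
```

The tree that groups the taxa sharing a state has the lowest score, and the other two tie.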

Parsimony: How do we find the most parsimonious tree?

  • We're going to take an exercise break and play with PAUP

PAUP

execute data/bears_morphology.nex
  • NOTE: PAUP allows tab-completion
  • Open the bears_morphology file in a text editor. Now:

PAUP: A couple important commands

cstatus
tstatus
showmatrix
showdist
log file="mylogfile"
  • Try each of these - what information do they give you?

PAUP: Building a tree

alltrees

What happened here?

Parsimony: Enumeration is not possible for more than 12 taxa

Parsimony Trees

??? This is one character. Imagine many - enumeration is not possible. Also note that several trees tie for the same “best” score.
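
The reason enumeration breaks down is that the number of unrooted, fully bifurcating trees for n taxa is the product 3 × 5 × ... × (2n - 5). A short base R sketch, with the hypothetical function name `n_unrooted_trees`:

```r
# Number of unrooted bifurcating trees for n_taxa tips: (2n - 5)!!
n_unrooted_trees <- function(n_taxa) {
  stopifnot(n_taxa >= 3)
  if (n_taxa == 3) {
    return(1)
  }
  prod(seq(3, 2 * n_taxa - 5, by = 2))
}

n_unrooted_trees(4)   # 3
n_unrooted_trees(5)   # 15
n_unrooted_trees(12)  # 654729075
n_unrooted_trees(18)  # roughly 1.9e17 for the 18 bears
```

Going from 12 to 18 taxa multiplies the number of trees by a factor of several hundred million, which is why the next step is a heuristic search.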

PAUP: Heuristic Searches

Heuristic - use of shortcuts to reduce the number of trees we need to search

hsearch
  • What is the name of the heuristic that was used?
  • How was the initial tree discovered?
  • How many trees were searched?
  • How many “best” trees were there, and what is their score?

PAUP: Heuristic Searches

Heuristic - use of shortcuts to reduce the number of trees we need to search

hsearch swap = nni
  • How many trees were examined with this algorithm? Why is this number so much smaller?
  • How many “best” trees were found, and what is their score?

PAUP: Heuristic Searches

Heuristic - use of shortcuts to reduce the number of trees we need to search

hsearch swap = spr
  • How many trees were examined with this algorithm?
  • How many “best” trees were found, and what is their score?
  • When would we expect the search algorithm to matter strongly?
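
One way to think about the difference between the swapping algorithms is the number of trees one rearrangement step away from a given tree. The sketch below uses the standard counts for unrooted binary trees; the function names are hypothetical, and the numbers describe a single round of swapping on one tree, not the totals PAUP reports for a whole search:

```r
# Trees reachable in one rearrangement from an unrooted binary tree with n tips
nni_neighbors <- function(n) 2 * (n - 3)
spr_neighbors <- function(n) 2 * (n - 3) * (2 * n - 7)

n_bears <- 18
nni_neighbors(n_bears)  # 30
spr_neighbors(n_bears)  # 870
```

A larger neighborhood means more trees scored per round, but also a better chance of escaping a locally optimal tree.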

PAUP: Exporting parsimony trees

savetrees from=1 to=1 file=results/tree1.tre;
savetrees from=2 to=2 file=results/tree2.tre;
savetrees from=3 to=3 file=results/tree3.tre;

PAUP: Exporting parsimony trees

library(ape)  # provides read.nexus()
library(gridExtra)
tree1 <- read.nexus("results/tree1.tre")
tree2 <- read.nexus("results/tree2.tre")
tree3 <- read.nexus("results/tree3.tre")
plot(tree1)

plot of chunk unnamed-chunk-25

PAUP: What do we do with multiple "best" trees?

Parsimony Trees

??? This is one character. Imagine many - enumeration is not possible. Also note that several trees tie for the same “best” score.

PAUP: What do we do with multiple "best" trees?

  • Typically: Build a consensus tree
  contree all / treefile=results/contree.tre;
  help contree
  • Please look at:
    • Semistrict
    • Majority rule
    • Adams
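
The idea behind two of the common consensus methods can be sketched in base R by describing each tree as the set of clades it contains. This is an illustration of the logic under toy assumptions (three made-up trees, clades written as comma-joined tip names), not PAUP's implementation. It shows strict consensus, which PAUP also offers, and majority rule; semistrict and Adams consensus are not shown:

```r
# Each tree is listed by its clades (toy example)
trees <- list(
  c("A,B", "A,B,C", "D,E"),
  c("A,B", "A,B,C", "D,E"),
  c("A,C", "A,B,C", "D,E")
)

# Fraction of trees in which each clade appears
clade_freq <- table(unlist(trees)) / length(trees)

strict   <- names(clade_freq)[clade_freq == 1]    # clades found in every tree
majority <- names(clade_freq)[clade_freq > 0.5]   # clades found in more than half

strict    # "A,B,C" "D,E"
majority  # "A,B" "A,B,C" "D,E"
```

The strict consensus collapses the disagreement among A, B, and C into a polytomy, while the majority-rule tree keeps the (A,B) grouping found in two of the three trees.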

PAUP: How do we assess confidence in a tree?

  • The bootstrap
library(knitr)  # provides kable()
char_df <- data.frame(char_data)
kable(char_df)
Agriarctos_spp Ailurarctos_lufengensis Ailuropoda_melanoleuca Ballusia_elmensis Helarctos_malayanus Indarctos_arctoides Indarctos_punjabiensis Indarctos_vireti Kretzoiarctos_beatrix Melursus_ursinus Tremarctos_ornatus Ursavus_brevirhinus Ursavus_primaevus Ursus_americanus Ursus_arctos Ursus_maritimus Ursus_thibetanus Zaragocyon_daamsi
? ? 1 ? 0 1 1 1 1 0 1 1 ? 0 0 0 0 0
0 0 0 0 1 1 1 1 1 1 1 0 ? 1 1 1 1 1
? ? 1 ? 0 1 1 ? ? 0 0 ? ? 0 0 0 0 ?
? ? 1 ? 0 1 1 1 ? 0 0 ? ? 0 0 0 0 ?
? ? 1 ? 0 1 1 1 ? 1 0 ? ? 0 0 0 1 ?
? ? 1 ? 0 1 1 1 ? 0 0 ? ? 0 0 0 0 ?
? ? 0 0 1 0 0 0 0 1 1 ? ? 1 1 1 1 ?
? ? 1 1 0 1 1 1 1 0 0 ? ? 0 0 0 1 ?
? ? 1 ? 0 1 1 1 ? 0 1 ? ? 0 0 0 0 ?
? ? 0 0 1 0 0 0 0 1 1 ? ? 1 1 1 1 0
? ? 1 0 1 1 1 1 1 0 0 ? ? 0 0 0 0 ?
? ? 0 0 0 0 0 0 0 1 0 ? ? 0 1 1 0 ?
? ? 0 0 0 0 0 0 0 0 1 ? ? 0 0 0 0 ?
? ? 1 0 1 1 1 1 1 0 1 ? ? 0 0 0 1 ?
? ? 0 0 0 0 0 0 0 0 0 0 ? 1 1 1 0 0
? ? 0 0 1 0 0 0 0 0 0 0 ? 1 1 1 1 0
0 ? 0 0 0 0 0 0 0 0 0 0 ? 0 1 1 0 0
0 0 0 0 1 1 1 1 0 1 1 1 ? 1 1 0 1 0
0 0 0 ? 0 0 0 0 0 0 0 ? ? 0 0 0 0 1
1 1 1 0 1 1 1 1 1 1 1 0 0 1 1 1 1 0
1 1 1 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1
1 1 1 0 1 1 1 1 1 1 1 0 0 1 1 1 1 0
1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1 0
0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1
? ? 1 ? 0 1 1 1 ? 0 0 1 1 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 1
1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0
? ? 0 0 0 1 1 1 ? 0 1 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 1 0
1 ? 1 0 0 1 1 1 1 1 1 1 1 0 0 0 0 1
1 ? 0 1 0 1 1 1 1 1 0 1 1 0 0 0 1 1
1 ? 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1
1 ? 0 0 1 1 1 1 1 0 1 1 1 0 0 1 1 1
0 0 0 ? 0 0 0 0 0 1 1 0 ? 0 0 0 0 0
1 1 1 ? 1 1 1 1 1 0 1 1 ? 1 1 1 1 0
1 1 1 0 0 1 1 1 1 0 0 ? ? 0 0 0 0 0
0 1 0 ? 0 0 0 0 0 0 1 0 0 0 1 1 1 ?
? ? 0 ? 1 0 0 0 0 0 0 ? ? 1 1 1 1 ?
? 0 0 ? 1 0 0 0 0 0 1 ? ? 1 1 1 1 ?
0 0 0 0 1 0 0 0 0 1 1 0 0 1 1 0 1 0
1 1 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0
1 1 1 0 0 1 1 1 1 1 1 0 0 0 1 0 0 1
1 1 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0
? 1 1 1 0 1 1 1 1 0 1 1 1 0 0 0 0 0
? 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1 1 0
? 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
? 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
? 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 1 0
? 1 1 1 1 1 1 1 ? 1 1 1 1 1 1 1 1 0
? 1 1 0 1 1 1 1 ? 1 1 1 1 1 1 1 1 0
? ? 0 ? 0 0 0 0 ? 1 0 ? ? 0 0 0 0 0
? ? 0 ? 0 1 1 1 ? 1 0 ? ? 1 1 1 0 ?
? ? 0 ? 1 0 0 0 ? 1 1 ? ? 1 1 1 1 ?
? ? 1 ? 0 1 ? ? ? 1 1 ? ? 0 0 0 0 ?
? ? 0 ? 0 1 0 1 ? 1 0 ? ? 0 0 0 1 ?
? ? 0 ? 1 0 0 0 ? 1 1 ? ? 1 1 1 1 ?
? ? 0 ? 1 0 0 0 ? 1 1 ? ? 1 1 1 1 ?
? ? 1 ? 0 1 1 1 ? 0 1 ? ? 0 0 0 1 ?
? ? 0 ? 1 0 0 ? ? 1 0 ? ? 1 1 1 0 ?
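
The bootstrap resamples characters, with replacement, to build pseudoreplicate datasets of the same size as the original; a tree is then estimated from each one. A minimal base R sketch of the resampling step only, using a toy matrix with taxa as rows and characters as columns (`char_mat` and `bootstrap_matrix` are hypothetical names, and no tree search is performed here):

```r
set.seed(1)  # make the resampling repeatable

# Toy matrix: rows are taxa, columns are characters
char_mat <- rbind(
  taxon_A = c("0", "1", "1", "0", "?"),
  taxon_B = c("0", "1", "0", "0", "1"),
  taxon_C = c("1", "0", "0", "1", "1")
)

# Draw as many characters as the original has, with replacement
bootstrap_matrix <- function(mat) {
  cols <- sample(ncol(mat), size = ncol(mat), replace = TRUE)
  mat[, cols, drop = FALSE]
}

replicates <- replicate(100, bootstrap_matrix(char_mat), simplify = FALSE)
length(replicates)     # 100 pseudoreplicate datasets
dim(replicates[[1]])   # same shape as the original: 3 taxa, 5 characters
```

Support for a clade is then the fraction of pseudoreplicate trees in which that clade appears.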

Parsimony: Some of these trees imply homoplasy

Parsimony Trees

??? multiple hits, or superimposed changes